Add pinned post support to feed generator #149

Merged: rudyfraser merged 42 commits into main from rude1/feedgen-pinned-post on Mar 11, 2026
@rudyfraser

Summary

  • Adds PINNED_POST_URI environment variable to configure a pinned post
  • When set, the pinned post is inserted at position 0 on first-page requests (no cursor)
  • Paginated scroll requests are unaffected; the pinned post appears only on initial load and pull-to-refresh
  • Banned users continue to see only the banned notice post
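
The rules in the summary can be sketched as a small pure function. This is a minimal illustration, not the feed generator's actual code: `apply_pinned_post`, its parameters, and the `banned_notice_uri` argument are hypothetical names, and returning the banned notice regardless of cursor is an assumption.

```rust
/// Hypothetical helper: applies the pinned-post rules to one page of feed URIs.
/// `pinned_post_uri` stands in for the value read from PINNED_POST_URI.
pub fn apply_pinned_post(
    mut posts: Vec<String>,
    pinned_post_uri: Option<&str>,
    cursor: Option<&str>,
    is_banned: bool,
    banned_notice_uri: &str,
) -> Vec<String> {
    // Banned users see only the banned notice post (assumed to hold on every page).
    if is_banned {
        return vec![banned_notice_uri.to_owned()];
    }
    // A request without a cursor is a first-page request (initial load or pull-to-refresh).
    if cursor.is_none() {
        if let Some(uri) = pinned_post_uri {
            posts.insert(0, uri.to_owned());
        }
    }
    // Paginated requests, and requests with the variable unset, pass through unchanged.
    posts
}
```

The sketch does not deduplicate, so a pinned post that also occurs naturally in the page would appear twice; whether the real implementation filters that case is not stated here.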

Test plan

  • Set PINNED_POST_URI to a valid AT URI and verify it appears first on initial feed load
  • Scroll down and verify the pinned post does not reappear in paginated results
  • Pull to refresh and verify the pinned post reappears at the top
  • Verify banned users do not see the pinned post
  • Unset PINNED_POST_URI and verify feeds behave as before

rudyfraser and others added 30 commits January 15, 2026 18:03
The bulk insert paths for posts and follows were not updating
profile_agg counters (postsCount, followsCount, followersCount).
This caused profiles to show 0 counts despite having posts/follows.

Added profile_agg updates after bulk inserts:
- copy_insert_posts: Update postsCount for creators
- copy_insert_follows: Update followsCount for creators,
  followersCount for subjects
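
As an in-memory sketch of the counter updates described above (the real code updates the profile_agg table after a COPY-based bulk insert), the following assumes a hypothetical `ProfileAgg` struct and helper names; only the counter names come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one profile_agg row.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ProfileAgg {
    pub posts_count: i64,
    pub follows_count: i64,
    pub followers_count: i64,
}

/// After a bulk insert of posts, bump postsCount once per inserted post.
pub fn apply_post_batch(aggs: &mut HashMap<String, ProfileAgg>, creators: &[String]) {
    for creator in creators {
        aggs.entry(creator.clone()).or_default().posts_count += 1;
    }
}

/// After a bulk insert of follows, bump followsCount for the creator
/// and followersCount for the subject of each (creator, subject) pair.
pub fn apply_follow_batch(aggs: &mut HashMap<String, ProfileAgg>, follows: &[(String, String)]) {
    for (creator, subject) in follows {
        aggs.entry(creator.clone()).or_default().follows_count += 1;
        aggs.entry(subject.clone()).or_default().followers_count += 1;
    }
}
```
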
The quote table was missing sortAt which is required by
the getQuotes dataplane route for pagination.
The posts_with_media filter in getAuthorFeed was returning empty results
because the wintermute indexer was not populating the post_embed_image
and post_embed_video tables.

Changes:
- Update handle_post_embeds() to detect and process image/video embeds
- Add handle_embed_images() and handle_embed_video() for single-record indexing
- Add extract_embed_data(), extract_images(), extract_video() for bulk processing
- Add copy_insert_post_embed_images() and copy_insert_post_embed_videos()
  bulk functions using COPY protocol
- Update copy_batch_insert_posts() to extract and insert embed data

This fixes the media tab showing empty on user profiles.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two migration scripts:
- backfill_post_embeds.sql: Single-shot migration for smaller datasets
- backfill_post_embeds_batched.sql: Batched approach for large tables

Run after deploying the indexer fix to populate post_embed_image and
post_embed_video for existing posts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ON CONFLICT DO NOTHING to all notification INSERT statements:
  - like notifications (was missing)
  - follow notifications (was missing)
  - repost notifications (was missing)
  - starterpack-joined notifications (was missing)
  - reply and quote already had it

- Add migration scripts:
  - dedupe_notifications.sql: Full deduplication for large tables
  - add_notification_unique_constraint.sql: Adds unique constraint

- Fix clippy warnings:
  - Use ToOwned::to_owned instead of closure
  - Use map_or_else instead of match for option handling
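
For readers unfamiliar with the two clippy suggestions, here is a small illustration of the patterns. The function names and data are made up for the example and are not taken from the indexer.

```rust
/// Passing the function path instead of the closure `|h| h.to_owned()`.
pub fn handles_to_owned(handles: &[&str]) -> Vec<String> {
    handles.iter().copied().map(ToOwned::to_owned).collect()
}

/// `map_or_else` instead of a `match` on the Option:
/// the first closure runs for None, the second argument for Some.
pub fn display_name_or_handle(display_name: Option<&str>, handle: &str) -> String {
    display_name.map_or_else(|| format!("@{handle}"), ToOwned::to_owned)
}
```
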
Implements app.bsky.video.* lexicon endpoints using Bunny Stream
for video transcoding and CDN delivery.

Features:
- getUploadLimits: Check user quotas
- uploadVideo: Upload video to Bunny Stream
- getJobStatus: Poll transcoding status
- Bunny webhook handler for encoding completion
- URL proxy mapping did/cid to Bunny video IDs

Configuration:
- BUNNY_LIBRARY_ID, BUNNY_API_KEY, BUNNY_PULL_ZONE
- VIDEO_SERVICE_DID, VIDEO_PUBLIC_URL
- DATABASE_URL for job tracking
Previously the video service was using Bunny's UUID as the blob $link,
which is invalid per AT Protocol spec. CIDs must be content-addressed
hashes (e.g., bafkreibjfgx2gprinfvicegelk5kosd6y2frmqpqzwqkg7usac74l3t2v4).

Changes:
- Add CID generation using cid and multihash-codetable crates
- Generate CIDv1 (raw codec 0x55, SHA-256) from video bytes during upload
- Store video_cid in job record for later use
- Use proper CID in blob reference when webhook completes job
- Update video_mappings to use content CID instead of Bunny UUID

This ensures video blobs have valid content-addressed identifiers that
comply with the AT Protocol specification.
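
To show what a CIDv1 with the raw codec and a SHA-256 multihash looks like at the byte level, here is a std-only sketch. It assumes the 32-byte SHA-256 digest has already been computed elsewhere; the commit itself uses the cid and multihash-codetable crates, and the function names below are hypothetical.

```rust
/// Lowercase RFC 4648 base32 without padding, as used by the multibase `b` prefix.
pub fn base32_lower(data: &[u8]) -> String {
    const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";
    let mut out = String::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for &byte in data {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
            buffer &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Assembles a CIDv1 string from a precomputed SHA-256 digest of the content.
pub fn cid_v1_raw_sha256(digest: &[u8; 32]) -> String {
    let mut bytes = Vec::with_capacity(36);
    bytes.push(0x01); // CID version 1
    bytes.push(0x55); // multicodec: raw
    bytes.push(0x12); // multihash function: sha2-256
    bytes.push(0x20); // digest length: 32 bytes
    bytes.extend_from_slice(digest);
    format!("b{}", base32_lower(&bytes))
}
```

Because the four header bytes are fixed, every CID built this way begins with `bafkrei`, which matches the example CID in the commit message.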
Implements proper AT Protocol blob flow:
- Video service now uploads blob to user's PDS first using forwarded service auth token
- PDS returns valid blob_ref which is stored in video_jobs.pds_blob_ref
- Client can then reference the blob in posts without BlobNotFound errors

Key changes:
- Add pds/ module using atrium for AT Protocol operations
- Add pds_blob_ref column to video_jobs table
- Extract PDS DID from token's aud claim
- Resolve did:web directly, did:plc via plc.directory
- Upload blob to PDS, then to Bunny for transcoding
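
The DID resolution step can be sketched as a mapping from a DID to the URL of its DID document. This is an illustration only: `did_document_url` is a hypothetical name, it handles host-only did:web identifiers, and it does not perform the HTTP fetch or extract the PDS endpoint from the document.

```rust
/// Returns the URL where the DID document is expected to be served,
/// or None for DID methods and shapes this sketch does not handle.
pub fn did_document_url(did: &str) -> Option<String> {
    if let Some(host) = did.strip_prefix("did:web:") {
        // Only plain hostnames are handled; path-style and port-encoded did:web are rejected.
        if host.is_empty() || host.contains(':') || host.contains('/') || host.contains('%') {
            return None;
        }
        return Some(format!("https://{host}/.well-known/did.json"));
    }
    if let Some(id) = did.strip_prefix("did:plc:") {
        if !id.is_empty() {
            // did:plc documents are looked up through the PLC directory.
            return Some(format!("https://plc.directory/{did}"));
        }
    }
    None
}
```
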
Video service now has its own identity (did:web:video.blacksky.community)
and creates service auth tokens to upload blobs to user PDSs.

Changes:
- Add signing module with K-256 JWT signing
- Load signing key from SIGNING_KEY_PATH env var
- Create service auth tokens with iss=video_service, aud=pds, sub=user
- Update PDS client to resolve user DID to their PDS endpoint
- Update Cargo.toml with k256 and sec1 dependencies
The atrium XRPC client was not properly forwarding the Authorization header,
resulting in 'Bearer did:plc:...' being sent instead of the actual JWT token.

Changed to direct reqwest HTTP calls with explicit headers for blob upload.
The PDS couldn't verify tokens signed by the video service because it
doesn't resolve did:web DIDs for external services.

New approach: client requests service auth token with aud=pds_did (not
video_service_did), and we forward that token directly to the PDS.
The PDS can verify it since it's signed by the user's own signing key.
Previously wintermute only processed #commit events from the firehose,
ignoring identity changes, account status updates, and sync events.

Changes:
- Add IdentityData and AccountData types to FirehoseEvent
- Parse #identity events and update actor handles via DID resolution
- Parse #account events and update actor upstream_status
- Parse #sync events and refresh handles (like identity events)
- Update all tests to include new FirehoseEvent fields

This fixes handle changes not being reflected in the appview.
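
A rough sketch of the dispatch described above, using in-memory state instead of the database. The `Event` and `Actor` shapes and the `resolve_handle` callback are simplified assumptions and do not mirror wintermute's actual FirehoseEvent, IdentityData, or AccountData types.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified actor row.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Actor {
    pub handle: Option<String>,
    pub upstream_status: Option<String>,
}

/// Hypothetical, simplified firehose event.
pub enum Event {
    Commit { did: String },
    Identity { did: String },
    Account { did: String, status: Option<String> },
    Sync { did: String },
}

/// Applies one event; `resolve_handle` stands in for DID resolution.
pub fn apply_event(
    actors: &mut HashMap<String, Actor>,
    event: &Event,
    resolve_handle: &dyn Fn(&str) -> Option<String>,
) {
    match event {
        // Record indexing for #commit is out of scope for this sketch.
        Event::Commit { .. } => {}
        // #identity and #sync both refresh the handle via DID resolution.
        Event::Identity { did } | Event::Sync { did } => {
            let handle = resolve_handle(did);
            actors.entry(did.clone()).or_default().handle = handle;
        }
        // #account updates the actor's upstream status.
        Event::Account { did, status } => {
            actors.entry(did.clone()).or_default().upstream_status = status.clone();
        }
    }
}
```
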
Changed ON CONFLICT DO NOTHING to ON CONFLICT DO UPDATE for records
that can be legitimately updated by users:
- profile: displayName, description, avatarCid, bannerCid
- feed_generator: displayName, description, avatarCid
- list: name, description, avatarCid
- starter_pack: name

Also fixed batch_insert_profiles to include avatarCid and bannerCid
columns which were previously missing.

This fixes the bug where profile updates (like changing avatar/bio)
were not being reflected in the appview because the original record
was kept due to ON CONFLICT DO NOTHING.
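
The difference between the two conflict strategies can be illustrated with a map keyed by record URI. This is an in-memory analogy with a hypothetical `Profile` struct, not the SQL the indexer runs.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the profile columns named in the commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Profile {
    pub display_name: String,
    pub description: String,
}

/// Analogue of ON CONFLICT DO NOTHING: an existing row wins, the update is dropped.
pub fn insert_do_nothing(table: &mut HashMap<String, Profile>, uri: &str, profile: Profile) {
    table.entry(uri.to_owned()).or_insert(profile);
}

/// Analogue of ON CONFLICT DO UPDATE: the incoming row replaces the stored columns.
pub fn insert_do_update(table: &mut HashMap<String, Profile>, uri: &str, profile: Profile) {
    table.insert(uri.to_owned(), profile);
}
```

With the first function a changed bio never reaches the table, which is the bug the commit describes; with the second the latest record is stored.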
Updated all 6 notification INSERT statements to use:
ON CONFLICT (did, "recordUri", reason) DO NOTHING

This prevents duplicate notifications when the same event is
processed multiple times (e.g., from live firehose and backfill).

Requires adding unique index on notification table:
CREATE UNIQUE INDEX notification_unique_idx
ON notification (did, "recordUri", reason);
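
The effect of the unique index combined with DO NOTHING can be modelled with a set keyed on the same three columns. A minimal in-memory sketch; `NotificationKey` and `NotificationTable` are hypothetical names.

```rust
use std::collections::HashSet;

/// The columns covered by notification_unique_idx.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct NotificationKey {
    pub did: String,
    pub record_uri: String,
    pub reason: String,
}

/// Hypothetical stand-in for the notification table.
#[derive(Debug, Default)]
pub struct NotificationTable {
    unique_idx: HashSet<NotificationKey>,
}

impl NotificationTable {
    /// Returns true if a row was inserted, false if the key already existed
    /// (the DO NOTHING case, for example the same event seen via firehose and backfill).
    pub fn insert_do_nothing(&mut self, key: NotificationKey) -> bool {
        self.unique_idx.insert(key)
    }

    pub fn len(&self) -> usize {
        self.unique_idx.len()
    }
}
```
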
The ON CONFLICT (did, "recordUri", reason) clause requires a matching
unique index to exist; otherwise PostgreSQL throws an error. Reverting
to plain ON CONFLICT DO NOTHING until the unique index can be created
during a maintenance window.
Added ON CONFLICT (did, "recordUri", reason) DO NOTHING to all 6
notification INSERT statements. This relies on the notification_unique_idx
index, which prevents duplicate notifications from being created.

This fixes the issue where the same notification could appear multiple
times with the same indexedAt timestamp due to parallel processing or
retries.
Changed snake_case to camelCase to match PostgreSQL schema:
- indexed_at -> indexedAt
- upstream_status -> upstreamStatus
Changed from ON CONFLICT ON CONSTRAINT actor_block_unique_subject
to ON CONFLICT (creator, subjectDid) because there are two unique
constraints on the same columns and the insert was hitting the other one.
The table has multiple unique constraints (uri PK, plus two on creator/subjectDid).
ON CONFLICT DO NOTHING handles conflicts on any of them.
Changed all 6 notification INSERT statements to use
ON CONFLICT (did, "recordUri", reason) DO NOTHING
instead of just ON CONFLICT DO NOTHING.

This requires the unique index notification_unique_idx to exist on the
notification table with columns (did, "recordUri", reason).
Previously only the direct parent author received a notification for
replies. Bluesky also notifies the thread root author (the person who
started the thread) for any reply in their thread, up to 5 levels deep.

This change adds a notification to the root post author when:
- The reply has a root that differs from the parent (nested reply)
- The root author is not the same as the post creator

This matches Bluesky's behavior where users see notifications for
replies anywhere in threads they started, not just direct replies.
Replaces the simple parent+root reply notification with the official
Bluesky behavior from post.ts notifsForInsert:

1. Mention notifications: Parse post facets and create notifications
   for app.bsky.richtext.facet#mention features. Previously mentions
   were not generating any notifications.

2. Reply ancestor walk: Use recursive CTE to walk up the thread
   ancestor chain up to REPLY_NOTIF_DEPTH (5 levels), notifying
   each ancestor author. This matches the official behavior where
   users get notified for replies anywhere in their thread, not
   just direct replies.

3. Descendant notifications for out-of-order indexing: When a post
   in the middle of a thread is indexed after its replies, notify
   ancestors about existing descendant replies. Uses recursive CTE
   to find descendants, then cross-products with ancestors where
   depth + height < REPLY_NOTIF_DEPTH.

Deduplication is handled by ON CONFLICT (did, "recordUri", reason)
DO NOTHING on all notification inserts.
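
The reply ancestor walk in item 2 can be sketched without SQL as a loop over parent links. This is an in-memory illustration rather than the recursive CTE itself: the `Post` struct and function name are hypothetical, and skipping the reply's own author and collapsing repeated authors are assumptions (the latter mirrors the ON CONFLICT deduplication).

```rust
use std::collections::HashMap;

/// Maximum number of ancestors to notify, as named in the commit message.
pub const REPLY_NOTIF_DEPTH: usize = 5;

/// Hypothetical minimal post row: author DID and optional parent URI.
pub struct Post {
    pub author: String,
    pub parent: Option<String>,
}

/// Returns the authors to notify for `reply_uri`, nearest ancestor first.
pub fn reply_notification_targets(posts: &HashMap<String, Post>, reply_uri: &str) -> Vec<String> {
    let reply = match posts.get(reply_uri) {
        Some(post) => post,
        None => return Vec::new(),
    };
    let mut targets: Vec<String> = Vec::new();
    let mut current = reply.parent.as_deref();
    let mut depth = 0;
    while let Some(uri) = current {
        if depth >= REPLY_NOTIF_DEPTH {
            break;
        }
        // Stop if an ancestor has not been indexed yet.
        let ancestor = match posts.get(uri) {
            Some(post) => post,
            None => break,
        };
        // Assumption: authors are not notified about their own replies, and only once.
        if ancestor.author != reply.author && !targets.contains(&ancestor.author) {
            targets.push(ancestor.author.clone());
        }
        current = ancestor.parent.as_deref();
        depth += 1;
    }
    targets
}
```

The out-of-order case in item 3 (notifying ancestors about already-indexed descendants) is not covered by this sketch.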
Tracing mention notification inserts to debug missing
mention notifications in production.
The label ingestion pipeline was missing the neg field entirely,
causing negation labels (neg: true) to be stored as neg: false.
This meant labels that Bluesky removed were never actually negated
in our database.

Changes:
- Add neg, cid, exp fields to Label and RawLabel structs
- Parse neg from CBOR label messages (defaults to false if absent)
- Update indexer INSERT to use actual neg value instead of hardcoded false
- ON CONFLICT now updates neg, cts, and exp (matching TS dataplane)
- Log negation labels at info level for visibility
- Add test_label_negation and test_parse_label_message_with_negation
- Fix clippy if_not_else lint in mention notification code
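
A small sketch of the negation handling: `neg` defaults to false when absent, and a later label with the same key overwrites `neg`, `cts`, and `exp`. The struct layouts, the `(src, uri, val)` conflict key, and the helper names are assumptions for illustration; the `cid` field is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical wire-level label with an optional `neg` field.
pub struct RawLabel {
    pub src: String,
    pub uri: String,
    pub val: String,
    pub neg: Option<bool>,
    pub cts: String,
    pub exp: Option<String>,
}

/// Hypothetical stored label.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Label {
    pub src: String,
    pub uri: String,
    pub val: String,
    pub neg: bool,
    pub cts: String,
    pub exp: Option<String>,
}

impl From<RawLabel> for Label {
    fn from(raw: RawLabel) -> Self {
        Label {
            src: raw.src,
            uri: raw.uri,
            val: raw.val,
            neg: raw.neg.unwrap_or(false), // absent means "not a negation"
            cts: raw.cts,
            exp: raw.exp,
        }
    }
}

pub type LabelKey = (String, String, String);

/// Upsert keyed on (src, uri, val): the newest label replaces neg, cts, and exp.
pub fn upsert_label(table: &mut HashMap<LabelKey, Label>, label: Label) {
    let key = (label.src.clone(), label.uri.clone(), label.val.clone());
    table.insert(key, label);
}

/// A label is in effect only if it exists and has not been negated.
pub fn is_active(table: &HashMap<LabelKey, Label>, src: &str, uri: &str, val: &str) -> bool {
    table
        .get(&(src.to_owned(), uri.to_owned(), val.to_owned()))
        .map_or(false, |label| !label.neg)
}
```
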
… app.bsky.verification.proof

The indexer was matching on a nonexistent collection type app.bsky.verification.proof
instead of the correct app.bsky.graph.verification lexicon. This caused all verification
records from the firehose to be silently ignored, leaving the verification table empty.

Also fixed the URI format strings in index_verification and delete_verification to use
the correct collection path.
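
To make the collection mismatch concrete, here is a sketch of matching on the collection segment of an AT URI and building a verification record URI. The helper names are hypothetical; only the collection string `app.bsky.graph.verification` comes from the commit.

```rust
/// The collection NSID the indexer should match on.
pub const VERIFICATION_COLLECTION: &str = "app.bsky.graph.verification";

/// Extracts the collection segment from `at://<authority>/<collection>/<rkey>`.
pub fn collection_of(uri: &str) -> Option<&str> {
    let rest = uri.strip_prefix("at://")?;
    let mut parts = rest.split('/');
    let _authority = parts.next()?;
    parts.next().filter(|collection| !collection.is_empty())
}

/// True when the record belongs to the verification collection.
pub fn is_verification_record(uri: &str) -> bool {
    collection_of(uri) == Some(VERIFICATION_COLLECTION)
}

/// Builds the record URI with the correct collection path.
pub fn verification_uri(did: &str, rkey: &str) -> String {
    format!("at://{}/{}/{}", did, VERIFICATION_COLLECTION, rkey)
}
```
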
Tracks remaining items:
- Hydration-time CID verification in client
- Firehose listener for orphaned content cleanup
- Community post threadgate support
- Community feed aggregation
- Content expiration policy
Move all rsky-video SQL queries to reference the videos schema
(videos.video_jobs, videos.upload_quotas, videos.video_mappings).
Add CREATE SCHEMA IF NOT EXISTS videos to migrations. Includes
cargo fmt formatting fixes across rsky-video.
When PINNED_POST_URI is set, the configured post is inserted at
position 0 of the feed on first-page requests only (no cursor).
Paginated scroll requests are unaffected. Banned users continue
to see only the banned notice post.
Notification inserts were using .await? which caused the entire
indexing function to bail out on failure. This prevented post_agg,
profile_agg, and feed_agg updates from running. Changed all 8
notification INSERT sites to log warnings instead of propagating
errors, ensuring aggregate count updates always complete.

Root cause: notification_id_seq hit 32-bit integer max (2147483647),
causing all notification inserts to fail and cascading to block
all aggregate count updates across the appview.
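
The failure mode and the fix can be sketched in a few lines: a 32-bit sequence that cannot advance past 2147483647, and an indexing step that logs the notification error instead of returning early. All names here are hypothetical, and the database calls are replaced by plain functions.

```rust
/// Models a 32-bit sequence: advancing past i32::MAX fails instead of wrapping.
pub fn next_notification_id(last: i32) -> Result<i32, String> {
    last.checked_add(1)
        .ok_or_else(|| "notification_id_seq reached its 32-bit maximum".to_owned())
}

/// Hypothetical summary of one indexing pass.
#[derive(Debug, PartialEq, Eq)]
pub struct IndexOutcome {
    pub notification_id: Option<i32>,
    pub aggregates_updated: bool,
}

/// Indexes one record: a failed notification insert is logged as a warning
/// rather than propagated with `?`, so the aggregate updates still run.
pub fn index_record(last_notification_id: i32) -> IndexOutcome {
    let notification_id = match next_notification_id(last_notification_id) {
        Ok(id) => Some(id),
        Err(err) => {
            eprintln!("warning: notification insert failed: {err}");
            None
        }
    };
    // Stand-in for the post_agg, profile_agg, and feed_agg updates.
    let aggregates_updated = true;
    IndexOutcome { notification_id, aggregates_updated }
}
```
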
@rudyfraser rudyfraser merged commit 4573b07 into main Mar 11, 2026
11 checks passed